{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(sphx_glr_tutorial_tvmc_command_line_driver.py)=\n",
    "# 用 TVMC 编译和优化模型\n",
    "\n",
    "原作者：[Leandro Nunes](https://github.com/leandron), [Matthew Barrett](https://github.com/mbaret), [Chris Hoge](https://github.com/hogepodge)\n",
    "\n",
    "在本节中，将使用 TVMC，即 TVM 命令行驱动程序。TVMC 工具，它暴露了 TVM 的功能，如 auto-tuning、编译、profiling 和通过命令行界面执行模型。\n",
    "\n",
    "在完成本节内容后，将使用 TVMC 来完成以下任务：\n",
    "\n",
    "* 为 TVM 运行时编译预训练 ResNet-50 v2 模型。\n",
    "* 通过编译后的模型运行真实图像，并解释输出和模型的性能。\n",
    "* 使用 TVM 在 CPU 上调优模型。\n",
    "* 使用 TVM 收集的调优数据重新编译优化模型。\n",
    "* 通过优化后的模型运行图像，并比较输出和模型的性能。\n",
    "\n",
    "本节的目的是让你了解 TVM 和 TVMC 的能力，并为理解 TVM 的工作原理奠定基础。\n",
    "\n",
    "## 使用 TVMC\n",
    "\n",
    "TVMC 是 Python 应用程序，是 TVM Python 软件包的一部分。当你使用 Python 包安装 TVM 时，你将得到 TVMC 作为命令行应用程序，名为 ``tvmc``。这个命令的位置将取决于你的平台和安装方法。\n",
    "\n",
    "另外，如果你在 ``$PYTHONPATH`` 上将 TVM 作为 Python 模块，你可以通过可执行的 python 模块 ``python -m tvm.driver.tvmc`` 访问命令行驱动功能。\n",
    "\n",
    "为简单起见，本教程将提到 TVMC 命令行使用 ``tvmc <options>``，但同样的结果可以用 ``python -m tvm.driver.tvmc <options>``。\n",
    "\n",
    "你可以使用帮助页面查看："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "usage: tvmc [--config CONFIG] [-v] [--version] [-h]\n",
      "            {micro,run,tune,compile} ...\n",
      "\n",
      "TVM compiler driver\n",
      "\n",
      "options:\n",
      "  --config CONFIG       configuration json file\n",
      "  -v, --verbose         increase verbosity\n",
      "  --version             print the version and exit\n",
      "  -h, --help            show this help message and exit.\n",
      "\n",
      "commands:\n",
      "  {micro,run,tune,compile}\n",
      "    micro               select micro context.\n",
      "    run                 run a compiled module\n",
      "    tune                auto-tune a model\n",
      "    compile             compile a model.\n",
      "\n",
      "TVMC - TVM driver command-line interface\n"
     ]
    }
   ],
   "source": [
    "!python -m tvm.driver.tvmc --help"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "``tvmc`` 可用的 TVM 的主要功能来自子命令 ``compile`` 和 ``run``，以及 ``tune``。要了解某个子命令下的具体选项，请使用 ``tvmc <subcommand> --help``。将在本教程中逐一介绍这些命令，但首先需要下载预训练模型来使用。\n",
    "\n",
    "## 获得模型\n",
    "\n",
    "在本教程中，将使用 ResNet-50 v2。ResNet-50 是卷积神经网络，有 50 层深度，设计用于图像分类。将使用的模型已经在超过一百万张图片上进行了预训练，有 1000 种不同的分类。该网络输入图像大小为 224x224。如果你有兴趣探究更多关于 ResNet-50 模型的结构，建议下载 [Netron](https://netron.app)，它免费提供的 ML 模型查看器。\n",
    "\n",
    "在本教程中，将使用 ONNX 格式的模型。\n",
    "\n",
    "```bash\n",
    "wget https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet50-v2-7.onnx\n",
    "```"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```{admonition} 支持的模型格式\n",
    "TVMC 支持用 Keras、ONNX、TensorFlow、TFLite 和 Torch 创建的模型。如果你需要明确地提供你所使用的模型格式，请使用选项 ``tvm.driver.tvmc compile --model-format``。\n",
    "```\n",
    "\n",
    "更多信息见 `python -m tvm.driver.tvmc compile --help`。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```{admonition} 为 TVM 添加 ONNX 支持\n",
    "TVM 依赖于你系统中的 ONNX python 库。你可以使用 ``pip3 install --user onnx onnxoptimizer`` 命令来安装 ONNX。如果你有 root 权限并且想全局安装 ONNX，你可以去掉 ``--user`` 选项。对 ``onnxoptimizer`` 的依赖是可选的，仅用于 ``onnx>=1.9``。\n",
    "```\n",
    "\n",
    "## 将 ONNX 模型编译到 TVM 运行时中\n",
    "\n",
    "一旦下载了 ResNet-50 模型，下一步就是对其进行编译。为了达到这个目的，将使用 ``tvmc compile``。从编译过程中得到的输出是模型的 TAR 包，它被编译成目标平台的动态库。可以使用 TVM 运行时在目标设备上运行该模型。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "WARNING:autotvm:One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.\n"
     ]
    }
   ],
   "source": [
    "# 这可能需要几分钟的时间，取决于你的机器\n",
    "!python -m tvm.driver.tvmc compile \\\n",
    "--target \"llvm\" \\\n",
    "--input-shapes \"data:[1,3,224,224]\" \\\n",
    "--output build/resnet50-v2-7-tvm.tar \\\n",
    "params/resnet50-v2-7.onnx"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "查看 ``tvmc compile`` 在 module 中创建的文件："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mod.so\n",
      "mod.json\n",
      "mod.params\n"
     ]
    }
   ],
   "source": [
    "%%bash\n",
    "mkdir models\n",
    "tar -xvf build/resnet50-v2-7-tvm.tar -C models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "列出了三个文件：\n",
    "\n",
    "* ``mod.so`` 是模型，表示为 C++ 库，可以被 TVM 运行时加载。\n",
    "* ``mod.json`` 是 TVM Relay 计算图的文本表示。\n",
    "* ``mod.params`` 是包含预训练模型参数的文件。\n",
    "\n",
    "该 module 可以被你的应用程序直接加载，而 model 可以通过 TVM 运行时 API 运行。\n",
    "\n",
    "```{admonition} 定义正确的 target\n",
    "指定正确的目标（选项 ``--target``）可以对编译后的模块的性能产生巨大的影响，因为它可以利用目标上可用的硬件特性。\n",
    "  \n",
    "欲了解更多信息，请参考 [为 x86 CPU 自动调优卷积网络](tune_relay_x86)。建议确定你运行的是哪种 CPU，以及可选的功能，并适当地设置目标。\n",
    "```\n",
    "\n",
    "## 用 TVMC 从编译的模块中运行模型\n",
    "\n",
    "已经将模型编译到模块，可以使用 TVM 运行时来进行预测。\n",
    "\n",
    "\n",
    "TVMC 内置了 TVM 运行时，允许你运行编译的 TVM 模型。为了使用 TVMC 来运行模型并进行预测，需要两样东西：\n",
    "\n",
    "- 编译后的模块，我们刚刚生成出来。\n",
    "- 对模型的有效输入，以进行预测。\n",
    "\n",
    "当涉及到预期的张量形状、格式和数据类型时，每个模型都很特别。出于这个原因，大多数模型需要一些预处理和后处理，以确保输入是有效的，并解释输出结果。TVMC 对输入和输出数据都采用了 NumPy 的 ``.npz`` 格式。这是得到良好支持的 NumPy 格式，可以将多个数组序列化为文件。\n",
    "\n",
    "作为本教程的输入，将使用一只猫的图像，但你可以自由地用你选择的任何图像来代替这个图像。\n",
    "\n",
    "### 输入预处理\n",
    "\n",
    "对于 ResNet-50 v2 模型，预期输入是 ImageNet 格式的。下面是为 ResNet-50 v2 预处理图像的脚本例子。\n",
    "\n",
    "你将需要安装支持的 Python 图像库的版本。你可以使用 ``pip3 install --user pillow`` 来满足脚本的这个要求。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "#!python ./preprocess.py\n",
    "from tvm.contrib.download import download_testdata\n",
    "from PIL import Image\n",
    "import numpy as np\n",
    "\n",
    "# 获取图片\n",
    "img_url = \"https://s3.amazonaws.com/model-server/inputs/kitten.jpg\"\n",
    "img_path = download_testdata(img_url, \"imagenet_cat.png\", module=\"data\")\n",
    "\n",
    "with Image.open(img_path) as im:\n",
    "    # 缩放图片到 224x224\n",
    "    resized_image = im.resize((224, 224))\n",
    "    img_data = np.asarray(resized_image).astype(\"float32\")\n",
    "# 转换为 ONNX 期望 NCHW 输入\n",
    "img_data = np.transpose(img_data, (2, 0, 1))\n",
    "# 归一化到 ImageNet 分布\n",
    "imagenet_mean = np.array([0.485, 0.456, 0.406])\n",
    "imagenet_stddev = np.array([0.229, 0.224, 0.225])\n",
    "norm_img_data = np.zeros(img_data.shape).astype(\"float32\")\n",
    "for i in range(img_data.shape[0]):\n",
    "    norm_img_data[i, :, :] = (img_data[i, :, :] / 255 - imagenet_mean[i]) / imagenet_stddev[i]\n",
    "# 添加 batch 维度\n",
    "img_data = np.expand_dims(norm_img_data, axis=0)\n",
    "# 保存预处理后数据（格式为 .npz）\n",
    "np.savez(\"build/imagenet_cat\", data=img_data)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 运行已编译的模块\n",
    "\n",
    "有了模型和输入数据，可以运行 TVMC 来做预测："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2023-03-17 14:53:10.700 INFO load_module /tmp/tmp5xszoh9l/mod.so\n"
     ]
    }
   ],
   "source": [
    "!python -m tvm.driver.tvmc run \\\n",
    "--inputs build/imagenet_cat.npz \\\n",
    "--output build/predictions.npz \\\n",
    "build/resnet50-v2-7-tvm.tar"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "回顾一下， ``.tar`` 模型文件包括 C++ 库，对 Relay 模型的描述，以及模型的参数。TVMC 包括 TVM 运行时，它可以加载模型并根据输入进行预测。当运行上述命令时，TVMC 会输出新文件，``predictions.npz``，其中包含 NumPy 格式的模型输出张量。\n",
    "\n",
    "在这个例子中，在用于编译的同一台机器上运行该模型。在某些情况下，可能想通过 RPC Tracker 远程运行它。要阅读更多关于这些选项的信息，请查看："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python -m tvm.driver.tvmc run --help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 输出后处理\n",
    "\n",
    "如前所述，每个模型都会有自己的特定方式来提供输出张量。\n",
    "\n",
    "需要运行一些后处理，利用为模型提供的查找表，将 ResNet-50 v2 的输出渲染成人类可读的形式。\n",
    "\n",
    "下面的脚本显示了后处理的例子，从编译的模块的输出中提取标签。\n",
    "\n",
    "运行这个脚本应该产生以下输出："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "class='n02123045 tabby, tabby cat' with probability=0.610552\n",
      "class='n02123159 tiger cat' with probability=0.367180\n",
      "class='n02124075 Egyptian cat' with probability=0.019365\n",
      "class='n02129604 tiger, Panthera tigris' with probability=0.001273\n",
      "class='n04040759 radiator' with probability=0.000261\n"
     ]
    }
   ],
   "source": [
    "#!python ./postprocess.py\n",
    "import os.path\n",
    "import numpy as np\n",
    "from scipy.special import softmax\n",
    "from tvm.contrib.download import download_testdata\n",
    "\n",
    "# 下载标签列表\n",
    "labels_url = \"https://s3.amazonaws.com/onnx-model-zoo/synset.txt\"\n",
    "labels_path = download_testdata(labels_url, \"synset.txt\", module=\"data\")\n",
    "\n",
    "with open(labels_path, \"r\") as f:\n",
    "    labels = [l.rstrip() for l in f]\n",
    "\n",
    "output_file = \"build/predictions.npz\"\n",
    "\n",
    "# 打开输出并读取输出张量\n",
    "if os.path.exists(output_file):\n",
    "    with np.load(output_file) as data:\n",
    "        scores = softmax(data[\"output_0\"])\n",
    "        scores = np.squeeze(scores)\n",
    "        ranks = np.argsort(scores)[::-1]\n",
    "        for rank in ranks[0:5]:\n",
    "            print(f\"class='{labels[rank]}' with probability={scores[rank]:f}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "试着用其他图像替换猫的图像，看看 ResNet 模型会做出什么样的预测。\n",
    "\n",
    "## 自动调优 ResNet 模型\n",
    "\n",
    "之前的模型是为了在 TVM 运行时工作而编译的，但不包括任何特定平台的优化。在本节中，将展示如何使用 TVMC 建立针对你工作平台的优化模型。\n",
    "\n",
    "在某些情况下，当使用编译模块运行推理时，可能无法获得预期的性能。在这种情况下，可以利用自动调优器，为模型找到更好的配置，获得性能的提升。TVM 中的调优是指对模型进行优化以在给定目标上更快地运行的过程。这与训练或微调不同，因为它不影响模型的准确性，而只影响运行时的性能。作为调优过程的一部分，TVM 将尝试运行许多不同的算子实现变体，以观察哪些算子表现最佳。这些运行的结果被存储在调优记录文件中，这最终是 ``tune`` 子命令的输出。\n",
    "\n",
    "在最简单的形式下，调优要求你提供三样东西：\n",
    "\n",
    "- 你打算在这个模型上运行的设备的目标规格\n",
    "- 输出文件的路径，调优记录将被保存在该文件中\n",
    "- 最后是要调优的模型的路径。\n",
    "\n",
    "默认搜索算法需要 `xgboost`，请参阅下面关于优化搜索算法的详细信息：\n",
    "\n",
    "```bash\n",
    "pip install xgboost cloudpickle\n",
    "```\n",
    "\n",
    "GPU 版本：\n",
    "\n",
    "```bash\n",
    "conda install -c conda-forge py-xgboost-gpu\n",
    "pip install cloudpickle\n",
    "```"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````{note}\n",
    "直接运行调优可能会跑不通：\n",
    "```bash\n",
    "python -m tvmc tune --target \"llvm\" \\\n",
    "--output build/resnet50-v2-7-autotuner_records.json \\\n",
    "params/resnet50-v2-7.onnx\n",
    "```\n",
    "参考 [issuue 13431](https://discuss.tvm.apache.org/t/error-when-trying-to-tune-the-resnet-model/13431) 解决 `tvmc tune` resnet50 ERROR 的问题。\n",
    "\n",
    "````"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "import onnx\n",
    "\n",
    "onnx_model = onnx.load_model('params/resnet50-v2-7.onnx')\n",
    "onnx_model.graph.input[0].type.tensor_type.shape.dim[0].dim_value = 1\n",
    "onnx_model.graph.output[0].type.tensor_type.shape.dim[0].dim_value = 1\n",
    "onnx.checker.check_model(onnx_model)\n",
    "onnx.save(onnx_model, 'params/resnet50-v2-7-frozen.onnx')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在这个例子中，如果你为 ``--target`` 标志指出更具体的目标，你会看到更好的结果。\n",
    "\n",
    "TVMC 将对模型的参数空间进行搜索，尝试不同的运算符配置，并选择在你的平台上运行最快的一个。尽管这是基于 CPU 和模型操作的指导性搜索，但仍可能需要几个小时来完成搜索。这个搜索的输出将被保存到 ``resnet50-v2-7-autotuner_records.json`` 文件中，以后将被用来编译优化的模型。\n",
    "\n",
    "```{admonition} 定义调优搜索算法\n",
    "默认情况下，这种搜索是使用 ``XGBoost Grid`` 算法引导的。根据你的模型的复杂性和可利用的时间，你可能想选择不同的算法。完整的列表可以通过查阅：\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python -m tvm.driver.tvmc tune --help"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对于消费级 Skylake CPU 来说，输出结果将是这样的："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Task  1/25]  Current/Best:  272.46/ 493.23 GFLOPS | Progress: (40/40) | 21.48 s Done.\n",
      "[Task  2/25]  Current/Best:  152.39/ 440.48 GFLOPS | Progress: (40/40) | 15.23 s Done.\n",
      "[Task  3/25]  Current/Best:  184.53/ 542.38 GFLOPS | Progress: (40/40) | 15.20 s Done.\n",
      "[Task  4/25]  Current/Best:  241.18/ 407.57 GFLOPS | Progress: (40/40) | 17.54 s Done.\n",
      "[Task  5/25]  Current/Best:  182.73/ 464.18 GFLOPS | Progress: (40/40) | 15.63 s Done.\n",
      "[Task  6/25]  Current/Best:  536.16/ 536.16 GFLOPS | Progress: (40/40) | 16.13 s Done.\n",
      "[Task  7/25]  Current/Best:  214.60/ 392.14 GFLOPS | Progress: (40/40) | 15.98 s Done.\n",
      "[Task  8/25]  Current/Best:  281.15/ 583.36 GFLOPS | Progress: (40/40) | 19.38 s Done.\n",
      "[Task  9/25]  Current/Best:  146.96/ 399.98 GFLOPS | Progress: (40/40) | 17.36 s Done.\n",
      "[Task 10/25]  Current/Best:   60.62/ 403.58 GFLOPS | Progress: (40/40) | 15.26 s Done.\n",
      "[Task 11/25]  Current/Best:  190.37/ 558.11 GFLOPS | Progress: (40/40) | 15.91 s Done.\n",
      "[Task 12/25]  Current/Best:  204.62/ 511.79 GFLOPS | Progress: (40/40) | 17.37 s Done.\n",
      "[Task 13/25]  Current/Best:  199.71/ 448.21 GFLOPS | Progress: (40/40) | 16.20 s Done.\n",
      "[Task 14/25]  Current/Best:  157.68/ 488.08 GFLOPS | Progress: (40/40) | 17.07 s Done.\n",
      "[Task 15/25]  Current/Best:  228.61/ 483.70 GFLOPS | Progress: (40/40) | 17.03 s Done.\n",
      "[Task 16/25]  Current/Best:  149.53/ 461.08 GFLOPS | Progress: (40/40) | 15.08 s Done.\n",
      "[Task 17/25]  Current/Best:  178.52/ 532.27 GFLOPS | Progress: (40/40) | 15.48 s Done.\n",
      "[Task 18/25]  Current/Best:   66.78/ 530.63 GFLOPS | Progress: (40/40) | 16.16 s Done.\n",
      "[Task 19/25]  Current/Best:   44.99/ 436.72 GFLOPS | Progress: (40/40) | 17.55 s Done.\n",
      "[Task 20/25]  Current/Best:  159.20/ 478.63 GFLOPS | Progress: (40/40) | 18.21 s Done.\n",
      "[Task 21/25]  Current/Best:  177.36/ 469.23 GFLOPS | Progress: (40/40) | 18.89 s Done.\n",
      "[Task 22/25]  Current/Best:  384.79/ 439.12 GFLOPS | Progress: (40/40) | 15.84 s Done.\n",
      "[Task 23/25]  Current/Best:  197.92/ 517.98 GFLOPS | Progress: (40/40) | 17.45 s Done.\n",
      "[Task 25/25]  Current/Best:    0.80/  52.66 GFLOPS | Progress: (40/40) | 32.85 s Done.\n",
      " Done.\n"
     ]
    }
   ],
   "source": [
    "!python -m tvm.driver.tvmc tune \\\n",
    "--target \"llvm -mcpu=broadwell\" \\\n",
    "--output build/resnet50-v2-7-autotuner_records.json \\\n",
    "params/resnet50-v2-7-frozen.onnx"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "调谐会话可能需要很长的时间，所以 ``tvmc tune`` 提供了许多选项来定制你的调谐过程，在重复次数方面（例如 ``--repeat`` 和 ``--number``），要使用的调优算法等等。\n",
    "\n",
    "## 用调优数据编译优化后的模型\n",
    "\n",
    "作为上述调谐过程的输出，获得了存储在 ``resnet50-v2-7-autotuner_records.json`` 的调谐记录。这个文件可以有两种使用方式：\n",
    "\n",
    "- 作为进一步调谐的输入（通过 ``tvmc tune --tuning-records``）。\n",
    "- 作为对编译器的输入\n",
    "\n",
    "编译器将使用这些结果来为你指定的目标上的模型生成高性能代码。要做到这一点，可以使用 ``tvmc compile --tuning-records``。\n",
    "\n",
    "获得更多信息："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python -m tvm.driver.tvmc compile --help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "现在，模型的调谐数据已经收集完毕，可以使用优化的算子重新编译模型，以加快计算速度。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python -m tvm.driver.tvmc  compile \\\n",
    "--target \"llvm\" \\\n",
    "--tuning-records build/resnet50-v2-7-autotuner_records.json  \\\n",
    "--output build/resnet50-v2-7-tvm_autotuned.tar \\\n",
    "params/resnet50-v2-7-frozen.onnx"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "验证优化后的模型是否运行并产生相同的结果："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2023-03-17 15:50:05.484 INFO load_module /tmp/tmpk_0v6k7d/mod.so\n"
     ]
    }
   ],
   "source": [
    "!python -m tvm.driver.tvmc run \\\n",
    "--inputs build/imagenet_cat.npz \\\n",
    "--output build/predictions.npz \\\n",
    "build/resnet50-v2-7-tvm_autotuned.tar"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "class='n02123045 tabby, tabby cat' with probability=0.610553\n",
      "class='n02123159 tiger cat' with probability=0.367179\n",
      "class='n02124075 Egyptian cat' with probability=0.019365\n",
      "class='n02129604 tiger, Panthera tigris' with probability=0.001273\n",
      "class='n04040759 radiator' with probability=0.000261\n"
     ]
    }
   ],
   "source": [
    "!python postprocess.py"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 比较已调谐和未调谐的模型\n",
    "\n",
    "TVMC 提供了在模型之间进行基本性能基准测试的工具。你可以指定重复次数，并且 TVMC 报告模型的运行时间（与运行时间的启动无关）。可以粗略了解调谐对模型性能的改善程度。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2023-03-17 15:52:07.029 INFO load_module /tmp/tmp1nt090vr/mod.so\n",
      "Execution time summary:\n",
      " mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  \n",
      "  43.1426      43.0816      49.2847      40.7562       1.3496   \n",
      "               \n"
     ]
    }
   ],
   "source": [
    "!python -m tvm.driver.tvmc run \\\n",
    "--inputs build/imagenet_cat.npz \\\n",
    "--output build/predictions.npz  \\\n",
    "--print-time \\\n",
    "--repeat 100 \\\n",
    "build/resnet50-v2-7-tvm_autotuned.tar"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2023-03-17 15:52:49.358 INFO load_module /tmp/tmpfvn7lje9/mod.so\n",
      "Execution time summary:\n",
      " mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  \n",
      "  49.7214      48.9426      60.2221      46.6976       2.2708   \n",
      "               \n"
     ]
    }
   ],
   "source": [
    "!python -m tvm.driver.tvmc  run \\\n",
    "--inputs build/imagenet_cat.npz \\\n",
    "--output build/predictions.npz  \\\n",
    "--print-time \\\n",
    "--repeat 100 \\\n",
    "build/resnet50-v2-7-tvm.tar"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 小结\n",
    "\n",
    "在本教程中，介绍了 TVMC，用于 TVM 的命令行驱动。演示了如何编译、运行和调优模型。还讨论了对输入和输出进行预处理和后处理的必要性。在调优过程之后，演示了如何比较未优化和优化后的模型的性能。\n",
    "\n",
    "这里介绍了使用 ResNet-50 v2 本地的简单例子。然而，TVMC 支持更多的功能，包括交叉编译、远程执行和剖析/基准测试（profiling/benchmarking）。\n",
    "\n",
    "要想知道还有哪些可用的选项，请看 ``tvmc --help``。\n",
    "\n",
    "在 [用 Python 接口编译和优化模型](tvmc_python) 教程中，将使用 Python 接口介绍同样的编译和优化步骤。"
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "f0a0fcc4cb7375f8ee907b3c51d5b9d65107fda1aab037a85df7b0c09b870b98"
  },
  "kernelspec": {
   "display_name": "Python 3.10.4 ('tvm-mxnet': conda)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.9"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
